-
Notifications
You must be signed in to change notification settings - Fork 761
Fix OTLP gRPC log exporter request wrapper and timeout precedence #4815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix OTLP gRPC log exporter request wrapper and timeout precedence #4815
Conversation
This commit fixes a critical bug in the OTLP gRPC log exporter where `_translate_data` was returning raw `ResourceLogs` instead of wrapping them in `ExportLogsServiceRequest`. This caused gRPC export calls to fail with a type error. Additionally, this fixes the `timeout` argument initialization. Previously, passing `timeout=0` would be treated as None due to boolean evaluation; it now checks `if timeout is not None` to correctly respect explicit zero timeouts.
|
|
| self, data: Sequence[ReadableLogRecord] | ||
| ) -> ExportLogsServiceRequest: | ||
| return encode_logs(data) | ||
| return ExportLogsServiceRequest(resource_logs=encode_logs(data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about that:
def encode_logs(
batch: Sequence[ReadableLogRecord],
) -> ExportLogsServiceRequest:
return ExportLogsServiceRequest(resource_logs=_encode_resource_logs(batch))
Do tests pass?
| credentials=credentials, | ||
| headers=headers or environ.get(OTEL_EXPORTER_OTLP_LOGS_HEADERS), | ||
| timeout=timeout or environ_timeout, | ||
| timeout=timeout if timeout is not None else environ_timeout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct but please add a test and also update it for all the signals. Said that OTLPExporterMixin has the same issue so without updating that too this won't change behavior. Also no idea how the grpc client will behave.
|
Thanks a lot @xrmx sir for the review! Before I continue, can you please confirm if this plan is correct? Apply the timeout is not None logic across all OTLP exporters (and the mixin if needed) + add tests for None, 0, and positive values. Revert the encode_logs double-wrap and return the original value as before. Thanks again for your guidance! 🙏 |
| credentials=credentials, | ||
| headers=headers or environ.get(OTEL_EXPORTER_OTLP_LOGS_HEADERS), | ||
| timeout=timeout or environ_timeout, | ||
| timeout=timeout if timeout is not None else environ_timeout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both requests and gRPC when a timeout is not set, they will wait forever for the server:
https://grpc.io/docs/guides/deadlines/#deadlines-on-the-client
https://requests.readthedocs.io/en/latest/user/advanced/#timeouts
I don't think we want that ? I think we should always set some upper bound..
An explicit timeout of 0 will cause an error from python requests library. I did not try it for grpc but it might be the same, have you tried it ?
My thinking is we should update the code to explicitly disallow (or ignore?) a timeout of 0, and also to update the code to set some upper bound on the timeout.. WDYT?
If you actually wanted this change you would also have to update this logic in all the exporters (we have similar logic for HTTP):
Description
This commit fixes a critical bug in the
OTLPLogExporterwhere_translate_datawas returning rawResourceLogsinstead of wrapping them in the expectedExportLogsServiceRequestenvelope. This previously caused gRPC export calls to fail with a type error.Additionally, this fixes a logic flaw in
__init__regarding thetimeoutargument. It changes the assignment to checkif timeout is not None, ensuring that an explicittimeout=0is respected rather than being overridden by environment variables.Fixes #4814
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
OTLPLogExporterand verify that_translate_datanow correctly returns an instance ofExportLogsServiceRequestinstead of a raw list.OTLPLogExporter(timeout=0)results in the instance having atimeoutof0, whereas previously it defaulted toNoneor the environment variable.Does This PR Require a Contrib Repo Change?
Checklist: